conversational interaction
ChatGPT on the Road: Leveraging Large Language Model-Powered In-vehicle Conversational Agents for Safer and More Enjoyable Driving Experience
Bond, Yeana Lee, Choe, Mungyeong, Hasan, Baker Kasim, Siddiqui, Arsh, Jeon, Myounghoon
Studies on in-vehicle conversational agents have traditionally relied on pre-scripted prompts or limited voice commands, constraining natural driver-agent interaction. To resolve this issue, the present study explored the potential of a ChatGPT-based in-vehicle agent capable of carrying continuous, multi-turn dialogues. Forty drivers participated in our experiment using a motion-based driving simulator, comparing three conditions (No agent, Pre-scripted agent, and ChatGPT-based agent) as a within-subjects variable. Results showed that the ChatGPT-based agent condition led to more stable driving performance across multiple metrics. Participants demonstrated lower variability in longitudinal acceleration, lateral acceleration, and lane deviation compared to the other two conditions. In subjective evaluations, the ChatGPT-based agent also received significantly higher ratings in competence, animacy, affective trust, and preference compared to the Pre-scripted agent. Our thematic analysis of driver-agent conversations revealed diverse interaction patterns in topics, including driving assistance/questions, entertainment requests, and anthropomorphic interactions. Our results highlight the potential of LLM-powered in-vehicle conversational agents to enhance driving safety and user experience through natural, context-rich interactions.
Large Language Models Will Change The Way Children Think About Technology And Impact Every Interaction Paradigm
It is a call for education and research in this space so that we can harness this irresistible force for more good than harm, and provides some early themes for designers to consider. We firstly discuss where and how LLMs have been used in school educational settings, and then explore the new opportunities that recently released models offer. A small-scale investigation reveals potentially large impacts on how children learn, and we highlight key things that we as a community need to be aware of.
2 A SIMPLE GUIDE TO LARGE LANGUAGE MODELS
Large Language Models -- think ChatGPT, Gemini, GPT-3, CoPilot -- are immense deep learning neural networks with exceptional numbers of parameters, which are trained to predict sequences of words, having been trained on most of the contents of the Internet. If I asked you to complete the sentence Twinkle, twinkle, little star, how I wonder what you ..... it is quite likely that, if you have been brought up in a Western culture, you will recognise the nursery rhyme and complete the line with .....are. LLMs do this, but on a massive scale. As the LLM has processed much of what has ever been written, it has ingested a large number of sequences of words, and compresses them to create an internal representation. An LLM can be seen as the JPEG of the web -- it is a lossy compressed version of the internet.
Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection
Yang, Bo, Guo, Jiaxian, Iwasawa, Yusuke, Matsuo, Yutaka
Recent studies have increasingly demonstrated that large language models (LLMs) possess significant theory of mind (ToM) capabilities, showing the potential for simulating the tracking of mental states in generative agents. In this study, we propose a novel paradigm called ToM-agent, designed to empower LLM-based generative agents to simulate ToM in open-domain conversational interactions. ToM-agent disentangles the confidence from mental states, facilitating the emulation of an agent's perception of its counterpart's mental states, such as beliefs, desires, and intentions (BDIs). Using past conversation history and verbal reflections, ToM-agent can dynamically adjust counterparts' inferred BDIs, along with related confidence levels. We further put forth a counterfactual intervention method that reflects on the gap between the predicted responses of counterparts and their real utterances, thereby enhancing the efficiency of reflection. Leveraging empathetic and persuasion dialogue datasets, we assess the advantages of implementing the ToM-agent with downstream tasks, as well as its performance in both first-order and second-order ToM. Our findings indicate that the ToM-agent can grasp the underlying reasons for its counterpart's behaviors beyond mere semantic-emotional supporting or decision-making based on common sense, providing new insights for studying large-scale LLM-based simulation of human social behaviors.
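To make the idea of keeping confidence separate from the inferred mental state more concrete, here is a toy sketch in Python. It is not the ToM-agent method: the class `InferredState`, the word-overlap comparison, and the linear update rule with its `rate` parameter are all assumptions introduced for illustration, standing in for the verbal reflections the abstract describes.

```python
from dataclasses import dataclass


@dataclass
class InferredState:
    """One inferred mental state of the counterpart (hypothetical structure)."""
    kind: str          # "belief", "desire", or "intention"
    content: str       # what the agent thinks the counterpart holds
    confidence: float  # in [0, 1], stored separately from the content


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets; a crude stand-in for a semantic comparison."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def reflect(state: InferredState, predicted: str, actual: str, rate: float = 0.5) -> InferredState:
    """Move confidence toward the agreement between predicted and real utterance.

    Assumption: a small gap between prediction and reality supports the
    inferred state, a large gap undermines it.
    """
    agreement = token_overlap(predicted, actual)
    new_confidence = (1 - rate) * state.confidence + rate * agreement
    return InferredState(state.kind, state.content, min(1.0, max(0.0, new_confidence)))


if __name__ == "__main__":
    guess = InferredState("desire", "wants reassurance", 0.5)
    print(reflect(guess, "thank you that helps", "thank you that helps").confidence)
    print(reflect(guess, "thank you that helps", "no stop").confidence)
```

Only the confidence changes in this sketch; revising the content of an inferred state would need a language model and is left out.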
Revisiting the Phenomenon of Syntactic Complexity Convergence on German Dialogue Data
We revisit the phenomenon of syntactic complexity convergence in conversational interaction, originally found for English dialogue, which has theoretical implications for dialogical concepts such as mutual understanding. We use a modified metric to quantify syntactic complexity based on dependency parsing. The results show that syntactic complexity convergence can be statistically confirmed in one of three selected German datasets that were analysed. Given that the dataset which shows such convergence is much larger than the other two selected datasets, the empirical results indicate a certain degree of linguistic generality of syntactic complexity convergence in conversational interaction. We also found a different type of syntactic complexity convergence in one of the datasets, although further investigation is still necessary.
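The abstract does not spell out the modified metric, so the following Python sketch uses dependency-tree depth as a stand-in complexity score and a simple first-half versus second-half comparison as a stand-in for convergence. Both choices, and the function names `tree_depth` and `convergence_gap`, are assumptions for illustration; the paper's statistical test is not reproduced here.

```python
from statistics import mean
from typing import Sequence


def tree_depth(heads: Sequence[int]) -> int:
    """Depth of a dependency tree from 1-based head indices (0 marks the root)."""
    def depth_of(token: int) -> int:
        depth, seen = 1, set()
        while heads[token - 1] != 0:
            if token in seen:
                raise ValueError("head indices contain a cycle")
            seen.add(token)
            token = heads[token - 1]
            depth += 1
        return depth

    return max(depth_of(t) for t in range(1, len(heads) + 1))


def convergence_gap(speaker_a: Sequence[float], speaker_b: Sequence[float]) -> float:
    """Mean |a - b| over the first half of the dialogue minus the second half.

    A positive value means the two speakers' scores moved closer together.
    """
    n = min(len(speaker_a), len(speaker_b))
    if n < 2:
        raise ValueError("need at least two aligned turns per speaker")
    diffs = [abs(x - y) for x, y in zip(speaker_a[:n], speaker_b[:n])]
    mid = n // 2
    return mean(diffs[:mid]) - mean(diffs[mid:])


if __name__ == "__main__":
    # "Der Hund bellt": Der -> Hund, Hund -> bellt, bellt is the root.
    print(tree_depth([2, 3, 0]))
    print(convergence_gap([5, 4, 3, 3], [1, 2, 3, 3]))
```

In practice the head indices would come from a dependency parser; any parser that emits one head per token (for example in CoNLL-U style) would fit this interface.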
OpenAI takes on Google: Microsoft-backed tech giant launches an AI search tool dubbed SearchGPT
Google executives may be fearing the worst once again as Microsoft-backed rival OpenAI launches a new AI-powered search tool. 'SearchGPT', which is being trialed as a prototype before a wider rollout, scours the web for live news and information just like Google Search. OpenAI says the new product is particularly useful for queries about current events, recent developments, or specific information that ChatGPT might not know. Social media users have noted the parallels with the world's biggest search engine, with one saying 'Google Search is definitely in trouble'. Another said: 'Anyone who has been paying attention knows there will be a new king of search within 10 years.
Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions
Liu, Zhengyuan, Yin, Stella Xin, Lee, Carolyn, Chen, Nancy F.
Intelligent tutoring systems (ITSs) that imitate human tutors and aim to provide immediate and customized instructions or feedback to learners have shown their effectiveness in education. With the emergence of generative artificial intelligence, large language models (LLMs) further enable these systems to hold complex and coherent conversational interactions. These systems would be of great help in language education as it involves developing skills in communication, which, however, has drawn relatively little attention. Additionally, due to the complicated cognitive development at younger ages, more endeavors are needed for practical uses. Scaffolding refers to a teaching technique where teachers provide support and guidance to students for learning and developing new concepts or skills. It is an effective way to support diverse learning needs, goals, processes, and outcomes. In this work, we investigate how pedagogical instructions facilitate the scaffolding in ITSs, by conducting a case study on guiding children to describe images for language learning. We construct different types of scaffolding tutoring systems grounded in four fundamental learning theories: knowledge construction, inquiry-based learning, dialogic teaching, and zone of proximal development. For qualitative and quantitative analyses, we build and refine a seven-dimension rubric to evaluate the scaffolding process. In our experiment on GPT-4V, we observe that LLMs demonstrate strong potential to follow pedagogical instructions and achieve self-paced learning in different student groups. Moreover, we extend our evaluation framework from a manual to an automated approach, paving the way to benchmark various conversational tutoring systems.
Troubles and Failures in Interactional Language. Towards a Linguistically Informed Taxonomy
The goal of this talk is to introduce a systematic research agenda which aims to understand the nature of interaction between humans and artificial conversational agents (CA) (henceforth human-machine interaction, HMI). Specifically, we shall take an explicit [...] It is one of the goals of this project to fill this gap by using theoretical models of language in interaction. Specifically, I propose to introduce a novel measure of comparison: the use of aspects of language that are dedicated to regulating conversational interaction (henceforth i-language).
Enabling Conversational Interaction with Mobile UI using Large Language Models
Wang, Bryan, Li, Gang, Li, Yang
Conversational agents show promise in allowing users to interact with mobile devices using language. However, to perform diverse UI tasks with natural language, developers typically need to create separate datasets and models for each specific task, which is expensive and labor-intensive. Recently, pre-trained large language models (LLMs) have been shown capable of generalizing to various downstream tasks when prompted with a handful of examples from the target task. This paper investigates the feasibility of enabling versatile conversational interactions with mobile UIs using a single LLM. We designed prompting techniques to adapt an LLM to mobile UIs. We experimented with four important modeling tasks that address various scenarios in conversational interaction. Our method achieved competitive performance on these challenging tasks without requiring dedicated datasets and training, offering a lightweight and generalizable approach to enable language-based mobile interaction.
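As a rough illustration of prompting a single LLM with a handful of examples from the target task, the Python sketch below assembles a few-shot prompt from exemplars and a target screen. The `Exemplar` class, the `build_prompt` helper, the text rendering of the screen, and the sample task are hypothetical and are not taken from the paper; the resulting string would be passed to whichever LLM is available.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Exemplar:
    """One worked example: a textual rendering of a screen and the desired output."""
    screen: str
    output: str


def build_prompt(task_instruction: str, exemplars: List[Exemplar], target_screen: str) -> str:
    """Concatenate the instruction, the exemplars, and the unanswered target screen."""
    parts = [task_instruction.strip(), ""]
    for i, ex in enumerate(exemplars, start=1):
        parts.append(f"Example {i}")
        parts.append(f"Screen: {ex.screen}")
        parts.append(f"Output: {ex.output}")
        parts.append("")
    # The prompt ends with an open "Output:" slot for the model to complete.
    parts.append(f"Screen: {target_screen}")
    parts.append("Output:")
    return "\n".join(parts)


if __name__ == "__main__":
    shots = [
        Exemplar("<input label='Email'>", "What is your email address?"),
        Exemplar("<input label='Phone'>", "What is your phone number?"),
    ]
    print(build_prompt("Ask the user for the value of the input field.",
                       shots, "<input label='City'>"))
```

Switching to a different UI task in this setup only means swapping the instruction and the exemplars, which is the appeal of the prompting approach over training a dedicated model per task.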
Gaudí: Conversational Interactions with Deep Representations to Generate Image Collections
Bursztyn, Victor S., Healey, Jennifer, Vinay, Vishwa
Right: A mood-board created by a professional designer using Gaudí for the given project briefing: "You're designing a new eco-friendly, high-end coffee brand that is notorious for its floral flavors." All images are from the BAM dataset [6]. Gaudí was developed to help designers search for inspirational images using natural language. In the early stages of the design process, designers will typically create thematic image collections called "mood-boards" (example shown in Figure 1) in order to elicit and clarify a client's preferred creative direction. Creating a mood-board involves sequential image searches which are currently performed using keywords or images. Gaudí transforms this process into a conversation where the user is gradually detailing the mood-board's theme. This representation allows our AI to generate new search queries from scratch, straight from a project's briefing, following a hypothesized mood. Previous computational approaches to this process tend to oversimplify the decision space, seeking to define it by hard coded qualities like dominant color, saturation and brightness [3, 2]. Recent advances in realistic language modeling (e.g., with GPT-3 [1]) and cross-modal image retrieval (e.g., with CLIP [5]) now allow us to represent image collections in a much richer semantic space, acknowledging richer variation in the stories designers tell when presenting a creative direction to a client.
Towards Teachable Conversational Agents
The traditional process of building interactive machine learning systems can be viewed as a teacher-learner interaction scenario where the machine-learners are trained by one or more human-teachers. In this work, we explore the idea of using a conversational interface to investigate the interaction between human-teachers and interactive machine-learners. Specifically, we examine whether teachable AI agents can reliably learn from human-teachers through conversational interactions, and how this learning compares with traditional supervised learning algorithms. Results validate the concept of teachable conversational agents and highlight the factors relevant for the development of machine learning systems that intend to learn from conversational interactions.